Study of integration of statistical model-based voice activity detection and noise suppression

Authors

Masakiyo Fujimoto

Kentaro Ishizuka

Tomohiro Nakatani

Abstract

This paper addresses robust front-end processing for automatic speech recognition (ASR) in noisy environments. To recognize corrupted speech accurately, it is necessary to employ methods that are robust against various types of interference. Usually, noise suppression (NS) is used for the front-end processing of ASR in noise. Voice activity detection (VAD) is also used in front-end processing to remove redundant non-speech periods. VAD and NS are typically combined in series. However, VAD and NS should not be treated as separate techniques, because the output information of each method can benefit the other. Thus, we investigate integrated front-end processing of VAD and NS, in which each can utilize the other's input and output information. The evaluation is carried out using a concatenated speech corpus, CENSREC-1-C. In the evaluation, the proposed method improves ASR accuracy compared with the conventional series combination.

Similar resources

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

A study of mutual front-end processing method based on statistical model for noise robust speech recognition

This paper addresses robust front-end processing for automatic speech recognition (ASR) in noise. Accurate recognition of corrupted speech requires noise robust front-end processing, e.g., voice activity detection (VAD) and noise suppression (NS). Typically, VAD and NS are combined as one-way processing, and are developed independently. However, VAD and NS should not be assumed to be independen...

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

A statistical model-based voice activity detection using multiple DNNs and noise awareness

In this paper, we propose an ensemble of deep neural networks (DNNs) that uses acoustic environment classification for statistical model-based voice activity detection (VAD). Since conventional decision functions for statistical model-based VAD are based on a shallow model and cannot take advantage of the diversity of the space distribution of features, we present to use the multiple DNNs s...

A priori SNR estimation and noise estimation for speech enhancement

A priori signal-to-noise ratio (SNR) estimation and noise estimation are important for speech enhancement. In this paper, a novel modified decision-directed (DD) a priori SNR estimation approach based on single-frequency entropy, named DDBSE, is proposed. DDBSE replaces the fixed weighting factor in the DD approach with an adaptive one calculated according to change of single-frequency entropy....

Publication date: 2008
